Remarks 

Reconsideration of the application is respectfully requested in view of the foregoing 
amendments and following remarks. With entry of amendments included herein, claims 1-2 and 4-37 
are pending in this application. Claims 1, 9, 23, 31, 36, and 37 are independent. No claims have been 
allowed. Claims 3, 10 and 17 have been previously canceled without prejudice. Claim 37 is rejected 
under 35 U.S.C. 102(b) as being anticipated by USP 5,729,471 to Jain et al. ("Jain"). Claims 1-2, 4-9, 
11-16, and 18-36 are rejected under 35 U.S.C. 103(a) as being unpatentable over Jain in view of USP 
5,612,743 to Lee ("Lee"). 

1. The Patent Office's Method of Citation 

The Office, in its rejection of the claims, uses notation that Applicants find difficult to correlate 
with specific claim language. "It is important for an examiner to properly communicate the basis for a 
rejection so that the issues can be identified early and the applicant can be given fair opportunity to 
reply." MPEP 706.02(j). 37 CFR 1.104(c)(2) states: 

In rejecting claims for want of novelty or for obviousness, the examiner must cite the best 
references at his or her command. When a reference is complex or shows or describes 
inventions other than that claimed by the applicant, the particular part relied on must be 
designated as nearly as practicable. The pertinence of each reference, if not apparent, 
must be clearly explained and each rejected claim specified. 

Applicants respectfully request that the Examiner "designate as nearly as practicable" "the 
particular part" of each reference relied on. Currently, the Applicants are not able to determine which 
portions of the cited art are meant to teach or suggest specific portions of the claim language. For 
example, for the claim 37 language "means for bundle adjusting the virtual key frames to obtain a 
complete three-dimensional reconstruction of the two-dimensional frames" the Examiner's rejection 
states: "(fig. 12, note the "3d visualization" section is the product of the adjusting of the virtual key 
frames to produce a complete three-dimensional reconstruction of the two-dimensional frame data in 
that if there is not enough known points from key frames, estimates or bundle adjustments were made 
to ascertain the best, possible three-dimensional reconstruction of the two-dimensional frame data to 
yield the 3d visualization)." [Action, page 14, lines 1-10.] Applicants are having difficulty making the 
following mapping: 

Bundle adjusting virtual key frames: The Examiner states: "the 3D visualization section is the 
product of adjusting of the virtual key frames." The applicants respectfully submit that they cannot 
locate the virtual key frames in Fig. 12, let alone "adjusting the virtual key frames," and so do not know 
how the "3d visualization section" is a "product of adjusting the virtual key frames." 

As another example, for the claim 1 language "performing three-dimensional reconstruction 
individually for each frame segment derived by dividing the sequence of frames" the rejection states: 
"(fig. 12, note there are multiple "image to ground projection" sections that are used to calculate and 
project an image or a partial model for each segment of that includes three-dimensional occupancy 
estimation for which a 3D map of is generated in an attempt to form a dynamic model: col.21, ln.63 to 
col. 22, ln.7, Jain discloses the obtaining of the feature points within the frames; col. 22, ln.62 to col. 23, 
ln. 56, Jain discloses the use of equations that includes three dimensional coordinates (x, y, z) that 
includes camera position or pose, camera angle and camera parameter to obtain a partial model or a 
"image to ground projection". [Action, page 16, lines 1-9.] 

Applicants respectfully note that they cannot locate where the claim limitation "performing 
three-dimensional reconstruction" is addressed in the above passage. Similarly, the additional 
limitation "performing three-dimensional reconstruction individually for each frame segment" also 
cannot be located. Similarly, the claim limitation "dividing the sequence of frames" also cannot be 
located in the above passage. 

As one more example, for the claim 1 language "means for calculating a partial model for each 
segment that include three-dimensional coordinates and camera pose for features within the frames of 
the segment, the three-dimensional coordinates and camera pose being derived from the frames of the 
segment" the rejection states "(fig 12, note there are multiple 'image to ground projection' sections that 
are used to calculate and project an image or a partial model for each segment of that includes three- 
dimensional occupancy estimation for which a 3D map of is generated in an attempt to for a dynamic 
model; col.21, ln.63 to col.22, ln.7, Jain discloses the obtaining of the feature points within the frames; 
col. 22, ln.62 to col.23, ln.56, Jain discloses the use of equations that includes three dimensional 
coordinates (x, y, z) that includes camera position or pose, camera angle and camera parameter to 
obtain a partial model or a "image to ground projection". [Action, page 13, lines 10-17.] 

Applicants cannot locate the "partial model" in Fig. 12. The Examiner states "there are 
multiple 'image to ground projection' sections that are used to calculate and project an image or a 
partial model for each segment. . ." Thus, the Examiner seems to correlate "an image" with the "partial 
model" of claim 1. However, Fig. 12 does not have anything called an "image" within it, other than 
the "image to ground projection" which is used, according to the Examiner, "to calculate and project 
an image or a partial model." Further, applicants are unable to determine which box or boxes in Fig. 
12 the Examiner believes correlate with the "partial model." 

Applicants respectfully request that the Examiner correlate specific portions of the cited 
references with specific claim language so that applicants, at a minimum, know that they are arguing 
the correct portions of the cited references against specific claim language. 



2. Response to Arguments 

The applicants in their attempt to traverse the Examiner's rejection were left confused as to the 
office's interpretation of both the claims and the references due to the nature of the Examiner's reply. 
Rather than answering the substance of the applicants' arguments, the Examiner has cut and pasted the 
earlier rejection, giving the applicant no feedback on the reasons for rejection of the new arguments. 

"Where the applicant traverses any rejection, the examiner should, if he or she repeats the 
rejection, take note of the applicant's argument and answer the substance of it." [MPEP 707.07(f), 
emphasis added.] However, as the Examiner has not responded, except to repeat his earlier rejection, 
applicants cannot determine the nature of the Examiner's disagreement with applicants' arguments, 
and so are left wondering how to appropriately reply. Especially considering the difficulties applicants 
are having in understanding which portions of the cited references correspond to specific claim 
language, applicants do not even understand if they are arguing the correct portions of the cited 
references in reference to specific claim language, and the Examiner's response does not give the 
applicants sufficient information to determine if their reply is on track. 

For example, with reference to claim 37, applicants argued in the previous response, filed 
December 8, 2006 [hereinafter previous response], on page 13, with reference to claim 37, that the 
cited Jain reference does not teach or suggest a "partial model." In response, the examiner merely 
pasted his original rejection of that portion of claim 37. [See Action, page 2, lines 8 through 20. 
Compare with the previous office action mailed September 8, 2006, page 12, line 21 through page 13 
line 7.] For another example, the Examiner did not address the applicants' assertion that the "image to 
ground projection" is just a projection of an object onto a different plane and thus is, as far as can be 
ascertained, a two dimensional object with only location values. [See previous response, page 13, lines 
7-19.] 

Using the same method, of merely pasting the original rejection without even mentioning the 
existence of an argument, the Examiner has failed to answer the substance of many arguments made by 
the applicants, including the following ones: 

• The combination of Jain and Lee was improper. One argument given was that the Examiner's 
proposed modification would change the principle of Jain. [Previous response, page 15, line 17 
to page 16, line 17.] The Examiner does not address this argument, merely summarizing some 
case law and then repeating the motivation to combine the two references originally given in 
the previous office action. [Action, page 3, line 14, to page 4, line 9; compare with the 
previous action, page 16, lines 13-17.] 

• The feature points in Jain and Lee refer to entirely different things, and thus the two references 
cannot be combined. [Previous response, page 24, line 18 to page 25, line 5.] 

• For claim 1, that Lee fails to teach using the threshold to track the number of feature points. 
[Previous response, page 18, lines 17-27.] 

• For claim 9, that Jain fails to teach including an uncertainty in a virtual frame. [Previous 
response, page 19, line 19 to page 20, line 22.] 

• For claim 23, the Examiner, in his rejection, stated that "Jain does not specifically disclose 
adding the second frame to the segment." [Previous action, page 26, line 4.] However, the 
Examiner failed to provide a reference which did disclose this feature. Applicants argued: 

o the Examiner should provide a reference that taught this feature; [Previous response, 
page 21, lines 22-26.] 

o the modification proposed by the Examiner ("adjusting a key frame") was not disclosed 
[Previous action, page 22, line 22 to page 23, line 8.] 

o Jain actively teaches against the Examiner's proposed modification [Previous action, 
page 23, line 9 to line 21.] 

Applicants respectfully request that the Examiner answer the substance of these arguments so 
that applicants may understand if the correct portions of the cited references are being applied to claim 
language by the applicants, and so that applicants may appropriately reply to the Examiner's rejection. 

3. The Final Rejection is Improper. 

The MPEP states that a final rejection should include a rebuttal of any arguments raised in the 
applicants' reply. [MPEP 706.07.] However, as described in sections 1 and 2, above, applicants have 
not received any rebuttal of any arguments that were raised in the applicants' last reply. Instead, the 
Examiner merely copied and pasted his earlier rejection. Hence, Applicants respectfully request that 
the final rejection be withdrawn, and a new rejection be issued which includes rebuttal of the 
arguments that have been raised in the applicants' replies. 



4. Claim Rejection under 35 USC § 102 

Claim 37 is rejected under 35 U.S.C. 102(b) as being anticipated by USP 5,729,471 to Jain et 
al. ("Jain"). 

Independent claim 37: 

Amended claim 37 recites: 

An apparatus for recovering a three-dimensional scene from a sequence of two- 
dimensional frames by segmenting the frames, comprising: 

means for dividing the sequence into segments comprising at least two frames 

means for calculating a partial model for each segment that includes three-dimensional 
coordinates and camera pose for features within the frames of the segment, the three- 
dimensional coordinates and camera pose being derived from the frames of the segment; 

means for extracting virtual key frames from each partial model. 

The applied reference Jain fails to teach or suggest many aspects of Applicants' claim 37. For 
instance, Jain does not teach or suggest the amended claim 37 language "means for dividing the 
sequence into segments comprising at least two frames." Jain discloses one key frame being manually 
selected for every thirty frames, and scene analysis being applied to the selected key frame. [Jain, 
23:64-67.] All of the frames between the selected key frames are discarded. The Examiner states: 
"clearly, Jain discloses there are segments within a sequence of frames, otherwise, the ascertainment of 
key frames would not be possible without these segments, where each segment is formed from a 
sequence of 30 frames." [Action, p. 32, lines 3-6.] Applicants respectfully disagree. The 
"ascertainment of key frames," which occurs every thirty frames, does not require a segment. Rather, it 
simply requires counting. Every thirtieth frame is chosen to be a key frame, with no use, and thus no 
segment, for the frames in between the chosen key frames. [Jain, 23:64-24:3.] 

If a key frame of Jain is considered a segment, then the segment consists solely of a single 
frame, as the other 29 frames are not used. Further, Jain teaches away from the language "comprising 
at least two frames" as coordinate values between two feature points are assumed to change linearly 
between two consecutive key frames. Thus there is no need to examine any of the discarded frames 
between key frames. [Jain, 24:2-3.] Additionally, no actions are taught or suggested in Jain that the 
applicants can locate which could be performed for a segment with multiple frames; e.g., "at least two 
frames." 

As pulling out every thirtieth frame to process does not teach or suggest "dividing the sequence 
into segments comprising at least two frames," Applicants believe claim 37 is not subject to a 102 
rejection and is therefore in condition for allowance. 

Moreover, Jain fails to teach or suggest "calculating a partial model for each segment that 
includes three-dimensional coordinates and camera pose for features within the frames of the segment, 
the three-dimensional coordinates and camera pose being derived from the frames of the segment." 

The Examiner states that "in column 22, line 62 to column 23, line 56, Jain discloses the use of 
equations that includes three dimensional coordinates (x, y, z) that includes camera position or pose, 
camera angle and camera parameter to obtain a partial model or a "image to ground projection." 
[Action, p. 2, lines 13-16.] Thus, the Examiner correlates the partial model in claim 37 with the 
"image to ground projection" in Figure 12 of Jain, and further ties the image to ground projection to 
the camera position or pose, camera angle and camera parameters. 

However, Jain discusses two separate, uncorrelated embodiments. The first embodiment is a 
system that can track specific football players in real time using fixed cameras whose location and 
movement are known. [See Jain, 20:21 to 25:24.] This section describes 3-D 
information and camera status, and is cited by the Examiner for the claim language "includes 
three-dimensional coordinates and camera pose." 

A second, vastly different embodiment is taught separately, in sections 7 through 9 and in 
figures 12 through 21. It uses three cameras to map the movement in a courtyard in the Engineering 
School at the University of California, San Diego. [Jain, 25:25-34:50.] This is the embodiment 
described with reference to Fig. 12, and the "image to ground projection" which the Examiner cites to 
teach or suggest a "partial model." The two embodiments are very different in operation. At the very 
least, there is no correlation disclosed between the features of the first embodiment and the features of 
the second embodiment. Thus, it cannot be determined what relationship the "three-dimensional 
information" and "camera status" of embodiment one [Jain, 22:4-7] has with Figure 12, or even if such 
information is included in the system described by Figure 12. 

As the reference cited by the Examiner for the claim language "three-dimensional coordinates 
and camera pose for features within the frames of the segment, the three-dimensional coordinates and 
camera pose being derived from the frames of the segment" and the reference cited by the Examiner 
for the claim language "calculating a partial model for each segment" do not have a clear relationship 
with each other, they cannot be combined. Thus, for this reason, claim 37 is in condition for 
allowance. 

Moreover, the image to ground projection of Jain does not teach or suggest the "partial model" 
of claim 37. The Examiner equates the partial model with the "image to ground projection" in Figure 
12 of Jain. [Action, page 13, line 10]. However, the "image to ground projection" box shown in Fig. 
12 is unmentioned within the specification of Jain. Looking at Figure 12, a box labeled "2D object 
tracking" is shown upstream of the "image to ground projection" box. The "2D object tracking" box 
of Fig. 12 is, likewise, not described in Jain. 

An embodiment of a partial model is explained, for example, in the following passage of the specification: 

For each segment, a partial model is created. For example, for segment 1, a partial model 
70 is shown whereas for segment N, a partial model 72 is shown. The partial model 
contains the same number of frames as the segment it represents. Thus, continuing the 
example above, the partial model 70 contains 100 frames and the partial model 72 
contains 110 frames. The partial models provide three-dimensional coordinates and 
camera pose for the feature points. The partial model may also contain an uncertainty 
associated with the segment. [Specification, p. 8, lines 15-21.] 

The otherwise unexplained phrase "image to ground projection" cannot be said to teach 
or suggest the different claim language "partial model." Moreover, as the "image to ground 
projection" of Jain is not actually taught, in that there is no explanation that Applicants can locate 
of how to calculate the "image to ground projection," Jain cannot be said to teach or suggest the 
claim language "means for calculating a partial model." 

Additionally, calculating the partial model requires using "three-dimensional coordinates and 
camera pose for features within the frames of the segment." As Jain discloses only one frame per 
segment, Jain cannot teach or suggest the quoted claim language, which requires, at a minimum, 
frames of the segment. 

Moreover, Jain also fails to teach or suggest "means for extracting virtual key frames from each 
partial model" as recited in claim 37. The Examiner cites to the selection of one key frame from every 
thirty frames to teach or suggest this limitation. [Action, page 13, lines 17-20.] However, the 
Examiner has also cited to the same selection of one key frame for every thirty frames to teach the 
limitation "dividing the sequence into segments" [Action, p. 12, \2. line 10 to p. 13, line 6.] Using the 
same cite in Jain, the single frame segment, to teach both segments and the virtual key frame, thus, 
requires that both the "segment" and the "virtual key frame" of claim 37 be the same key frame in Jain. 
However, at a minimum, the "segment" and the "virtual key frame" of claim 37 are different features, 
described with different language. Even if, for argument's sake, we assume that either the "segment" 
or the "virtual key frame" is taught by the Jain cite at 23:58-24:3, the same cite cannot teach the 
other, different element. Thus, the Examiner has failed to provide a citation in Jain that teaches either 
the "segment" or the "virtual key frame" as required in a 102 rejection. Thus, at least for this reason, 
applicants respectfully request that the 102 rejection be withdrawn. 

Likewise, Jain does not teach or suggest the claim language "means for extracting virtual key 
frames from each partial model" as Jain does not suggest extracting virtual key frames from each 
partial model. The selection of one key frame for every 30 frames is not described as being extracted 
from anything, except possibly the thirty original film frames, and is certainly not described as being 
extracted from "each partial model." [See, Jain 23:58-24:3.] 

Since the cited reference fails to describe several elements recited in claim 37, Applicants 
request the rejection of claim 37 be withdrawn. 

5. Claim Rejections Under 35 USC § 103 

Claims 1-2, 4-9, 11-16, and 18-36 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over USP 5,729,471 Jain et al. ("Jain") in view of USP 5,612,743 to Lee ("Lee"). Applicants 
respectfully submit that the claims in their present form are allowable over the applied art. To 
establish a prima facie case of obviousness, three basic criteria must be met. First, there must be some 
suggestion or motivation, either in the references themselves or in the knowledge generally available to 
one of ordinary skill in the art, to modify the reference or to combine reference teachings. Second, 
there must be a reasonable expectation of success. Finally, the prior art reference (or references when 
combined) must teach or suggest all the claim limitations. (MPEP § 2142.) 

A. Jain and Lee, either in combination or separately, fail to teach at least one element 
of claims 1-2, 4-9, 11-16, and 18-36. 



Independent Claim 1 

Amended claim 1 recites as follows: 

dividing the sequence of frames into frame segments comprising at least one base frame 
and at least one next frame wherein the frames in the sequence comprise feature points and 
wherein the sequence of frames is divided into frame segments based upon the at least one next 
frame in each frame segment having at least a minimum number of feature points being tracked 
to the at least one base frame in the frame segment; 

The applied references, Jain and Lee, both individually and in combination, fail to teach or 
suggest many aspects of claim 1. For instance, neither Jain nor Lee, either separately or in 
combination, teaches or suggests the amended claim 1 language "dividing the sequence of frames into 
frame segments comprising at least one base frame and at least one next frame." In the "Response to 
Arguments," the Examiner states: "clearly, Jain discloses there are segments within a sequence of 
frames, otherwise, the ascertainment of key frames would not be possible without these segments, 
where each segment is formed from a sequence of 30 frames." [Action, page 5, lines 1-3.] Applicants 
respectfully disagree. Jain discloses one key frame being manually selected for every thirty frames, 
and scene analysis being applied to the selected key frame. [Jain, 23:64-67.] All of the frames 
between the selected key frames are discarded. Thus, if a key frame of Jain, for argument's sake, is 
considered a segment, then the segment consists solely of a single frame. Using one frame and 
essentially discarding the next 29 frames does not teach or suggest dividing the sequence of frames into 
frame segments. Rather, it teaches reducing the frames: only 1/30th of the original frames are used in 
Jain. No use for the 29 frames not chosen as key frames is disclosed. 

Further, Jain teaches away from a segment with more than a single frame, as coordinate values 
contained in each frame are assumed to change linearly between two consecutive key frames; thus 
there is no need to examine any of the 29 discarded frames between key frames. [Jain, 24:2-3.] Lee, 
also, either separately or in combination with Jain, fails to teach the above-mentioned claim language. 

Moreover, Jain does not teach or disclose a "next frame," as found in the amended claim 
language. One frame out of 30 frames is chosen to be a key frame, and has scene analysis performed 
on it. Even if the key frame of Jain, for argument's sake, is assumed to be a segment, the segment 
would only have a single frame. There is no teaching or suggestion in Jain of a "next frame." As 
using one frame as a key frame and discarding the next 29 frames does not teach or suggest "dividing 
the sequence of frames into frame segments comprising at least one base frame and at least one next 
frame" applicants believe claim 1 is not subject to a 103 rejection and is therefore in condition for 
allowance. 

Additionally, neither Jain nor Lee, either separately or in combination, teaches "performing 
three-dimensional reconstruction individually for each frame segment derived by dividing the 
sequence of frames." Applicants believe that the Examiner cites to the Fig. 12 "image to ground 
projection" box to teach the above claim language. However, as noted with reference to the discussion 
of claim 37, above, "image to ground projection" is not otherwise described in Jain. The mere words 
"image to ground projection" do not teach or suggest "performing three dimensional reconstruction." 

Applicants believe that the Examiner also cites to the "3-D occupancy estimation" in Fig. 12 of 
Jain to teach at least a portion of "performing three-dimensional reconstruction individually for each 
frame segment derived by dividing the sequence of frames." However, the applicants cannot locate the 
phrase "3D occupancy estimation" (or even the word "occupancy") within the specification. As such, 
the nature of "3D occupancy estimation" remains a mystery. 

However, even assuming, for argument's sake, that Fig. 12 of Jain teaches three-dimensional 
reconstruction, neither Fig. 12 nor Fig. 13 discusses or suggests a frame segment, let 
alone "performing three-dimensional reconstruction individually for each frame segment." As such, the 
cited material does not teach or suggest "performing three-dimensional reconstruction individually for 
each frame segment derived by dividing the sequence of frames." For the reasons listed above, 
applicants believe that claim 1 is in condition for allowance. 

Since the applied references do not teach or suggest at least one element of claim 1, claim 1 in 
its present form should be allowed. 

Dependent claims 2 and 4-8 

Claims 2 and 4-8 ultimately depend on claim 1 and, thus, at least for the reasons set forth 
above with respect to claim 1, claims 2 and 4-8 should also be allowed. Furthermore, each of 
claims 2 and 4-8 also recites independently patentable features and, thus, should be allowed for that 
reason. 

Independent claim 9 

Claim 9 recites as follows: 

A method of recovering a three-dimensional scene from two-dimensional images, 
the method comprising: 

dividing the sequence of frames into segments, . . . 

for each segment, encoding the frames in the segment into at least two virtual 
frames that include a three-dimensional structure for the segment and an uncertainty 
associated with the segment .... 

for each of the at least two chosen frames, projecting a plurality of three- 
dimensional points into a corresponding virtual frame; and 

for each of the at least two chosen frames, projecting an uncertainty into the 
corresponding virtual frame. 

The applied reference Jain fails to teach or suggest many aspects of Applicants' claim 9. For 
instance, Jain fails to teach or suggest "a method of recovering a three-dimensional scene from 
two-dimensional images, the method comprising ... dividing the sequence of frames into segments, ... for 
each segment, encoding the frames in the segment into at least two virtual frames that include a 
three-dimensional structure for the segment and an uncertainty associated with the segment ... for each of the 
at least two chosen frames, projecting a plurality of three-dimensional points into a corresponding 
virtual frame; and for each of the at least two chosen frames, projecting an uncertainty into the 
corresponding virtual frame" as recited in Applicants' claim 9. 

The Examiner relies on the following passage to teach "dividing the sequence of frames into 
segments": 

Ideally the scene analysis process just described should be applied to every video 
frame in order to get the most precise information about (i) the location of players and (ii) 
the events in the scene. However, it would require significant human and computational 
effort to do so in the rudimentary, prototype, MPI video system because feature points are 
located manually, and not by automation. Therefore, one key frame has been manually 
selected for every thirty frames, and scene analysis has been applied to the selected key 
frames. For frames in between, player position and camera status is estimated by 
interpolation between key frames by proceeding under the assumption that coordinate 
values change linearly between a consecutive two key frames. [Jain, 23:58-24:3.] 
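
By way of illustration only, the interpolation described in the passage quoted above can be 
sketched as follows. This is a minimal sketch and is not taken from Jain: the function name, the 
two-dimensional (x, y) position tuples, and the sample values are assumptions made for the example. 
Only the interval of thirty frames and the linear-change assumption come from the quoted passage. 

```python
def interpolate_between_key_frames(key_positions, interval=30):
    """Estimate a position for every frame from positions known only at key frames.

    key_positions: list of (x, y) tuples, one per key frame (hypothetical format).
    interval: number of frames from one key frame to the next (thirty in the passage).
    """
    if not key_positions:
        return []
    estimated = []
    # Walk over each pair of consecutive key frames.
    for (x0, y0), (x1, y1) in zip(key_positions, key_positions[1:]):
        for step in range(interval):
            t = step / interval  # fraction of the way to the next key frame
            # Coordinates are assumed to change linearly between key frames.
            estimated.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    # The final key frame has no successor, so it is appended as-is.
    estimated.append(key_positions[-1])
    return estimated


positions = interpolate_between_key_frames([(0.0, 0.0), (30.0, 60.0)])
```

In this sketch only the key frames are analyzed. Every in-between position is computed from the 
two neighboring key frames rather than from the in-between frames themselves. 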

The Examiner suggests that the exact passage, [Jain, 23:58-24:3] teaches "encoding the frames 
in the segment into at least two virtual frames ..." 

For the same passage to teach both "segments" and "virtual key frames" the Examiner must be 
assuming that the "segments" and the "virtual key frames" are the same structure in claim 9. However, 
the claim language indicates that they are two different structures, one a "segment" with frames and 
one a "virtual key frame." Further, the frames in the segment (structure 1) are encoded into "at least 
two virtual key frames. . . ." The Examiner has failed to point out how the same feature in Jain teaches 
both segments and "virtual key frames." 

Further, the claim requires "encoding" the frames into "at least two" virtual frames. The 
Examiner has failed to provide a reference that encodes frames into virtual frames. The Examiner has 
also failed to provide a reference that encodes frames into at least two virtual frames. 

Lee, either in combination with Jain or separately, also fails to teach virtual key frames, 
encoding frames into virtual frames, or encoding frames into at least two virtual frames. Thus, for 
these reasons, claim 9 is in condition for allowance. 

Dependent claims 11-16 and 18-22 

Claims 11-16 and 18-22 ultimately depend on claim 9 and thus, at least for the reasons set forth above 
with respect to claim 9, claims 11-16 and 18-22 should also be allowed. Furthermore, each of claims 
11-16 and 18-22 also recites independently patentable features and, thus, should be allowed for that reason. 

Independent Claim 23 

The applied references Jain and Lee fail to teach or suggest many aspects of Applicants' 
claim 23. For instance, Jain and Lee both fail to teach or suggest 

(e) determining a number of the selected feature points from the base frame 
that are also identified in the next frame; and 

(f) if the number of the selected feature points from the base frame that are 
also identified in the next frame is greater than or equal to a threshold number, adding the 
next frame to the first segment of frames of the sequence. 

The Examiner states: "Jain does not specifically disclose the adding the second frame to the 
segment." [Action, p. 26, line 4.] Applicants respectfully agree. Applicants also state that Lee does not 
disclose such an element either. Applicants respectfully suggest that, at a minimum, an obviousness 
rejection should include a reference where the cited language is taught. "To establish prima facie 
obviousness of a claimed invention, all the claim limitations must be taught or suggested by the prior 
art. In re Royka, 490 F.2d 981, 180 USPQ 580 (CCPA 1974)." MPEP 2143.03. As the Examiner has 
not provided a reference for the claim limitation "adding the next frame to the first segment of frames 
of the sequence," Applicants respectfully state that, for the above reasons, a 103 rejection is improper 
and that claim 23 is in condition for allowance. 

Jain also does not suggest the above claim limitation. The Examiner states that "it would have 
been obvious to one of ordinary skill in the art to manually change the number of key (representative) 
frames per segment from anywhere between two to five key or representative frames per segment if 
necessary for accurately enhancing the three-dimensional representation of the targeted scene." 
Applicants respectfully disagree. Jain teaches selecting one frame from each 30 frames of film and 
performing scene analysis on that frame. [Jain, 23:57-24:3.] Jain does not mention any use for the 
other 29 frames. Thus, Jain teaches, at most, using one of every thirty frames for scene analysis. 
If, for argument's sake, the one frame used for scene analysis is considered a segment, Jain teaches a 
segment that consists of at most a single frame. Thus, Jain teaches away from creating a segment that 
has more than a single frame, because Jain discloses no process or method that can utilize a segment 
of two frames. 

Since the applied references do not teach or suggest at least one element of claim 23, claim 23 
in its present form should be allowed. 

Dependent claims 24-30 

Claims 24-30 depend on claim 23 and thus, at least for the reasons set forth above with respect 
to claim 23, claims 24-30 should be allowed. Furthermore, each of the claims 24-30 also recites 
independently patentable features and, thus, should be allowed for that reason. 

Independent claim 31 

Claim 31 recites as follows: 

dividing a long sequence of frames into segments and reducing the number of 
frames in each segment by representing the segments using between two and five 
representative frames per segment. . . . 

The applied references Jain and Lee fail to teach or suggest many aspects of Applicants' 
claim 31. For instance, Jain and Lee both fail to teach or suggest "dividing a long sequence of frames 
into segments and reducing the number of frames in each segment by representing the segments using 
between two and five representative frames per segment." 

The Examiner states that "Jain does not specifically disclose the reducing the number of frames 
in each segment by representing the segments using between two and five representative frames per 
segment." [Action at p. 29, lines 5-7.] Applicants agree. The Examiner then states: "However, Jain 
discloses the manual adjustment of the number of key frames, where the number is one key frame for 
every thirty frames, i.e., a segment. Therefore, since Jain teaches the manual adjustment of one key 
frame or representative frame for every thirty frames, it would have been obvious to one of ordinary 
skill in the art to manually change the number of key (representative) frames per segment from 
anywhere between two to five key or representative frames per segment if necessary for accurately 
enhancing the three-dimensional representation of the targeted scene." [Action at p. 29, lines 7-14.] 
Applicants respectfully disagree. 

The Examiner has not provided a reference which teaches "reducing the number of frames in 
each segment by representing the segments using between two and five representative frames per 
segment" as recited in claim 31. Applicants respectfully suggest that, at a minimum, an obviousness 
rejection should include a reference where the cited language is taught. "To establish prima facie 
obviousness of a claimed invention, all the claim limitations must be taught or suggested by the prior 
art. In re Royka, 490 F.2d 981, 180 USPQ 580 (CCPA 1974)." MPEP 2143.03. Since the cited 
references do not teach or suggest at least the cited portions of claim 31, Applicants respectfully 
suggest that this claim is in condition for allowance. 

Furthermore, the Examiner has provided no reference for the claim language "reducing the 
number of frames in each segment ..." As each of Jain's segments comprises at most one frame (the 
key frame), with the other 29 frames discarded, it is difficult to see how such segments could be 
reduced. As mentioned previously, the Examiner must provide a reference which teaches or suggests 
each limitation in the claims. MPEP 2143.03. As the Examiner has failed to do so, Applicants 
respectfully note that claim 31 is in condition for allowance. 

Moreover, neither Jain nor Lee discloses "ending a segment whenever a frame does not contain a 
predetermined threshold of feature points that are contained in the base frame." The Examiner does 
not provide a reference for the above-quoted claim language. Rather, the Examiner states that "Jain 
does not disclose a predetermined threshold of feature points that are contained in the base frame" and 
that "Lee teaches the predetermined threshold of feature points that are contained in the base frame." 
[Action, page 29, lines 22-23.] Even if, for argument's sake, we assume that Lee does disclose 
predetermined feature points contained in a base frame, this neither teaches nor suggests the additional 
limitation "ending a segment whenever a frame does not contain a predetermined threshold of feature 
points that are contained in the base frame." The ending of segments is not mentioned in either Lee or 
Jain. As ending segments is not taught, the additional limitation "ending a segment whenever a 
frame does not contain a predetermined threshold of feature points that are contained in the base 
frame" is also not taught or suggested. As a 103 rejection requires, at a minimum, that all limitations 
be taught or suggested, claim 31 is in condition for allowance. 

For all of the reasons mentioned above, claim 31 is in condition for allowance. 



Dependent claims 32-35 

Claims 32-35 depend on claim 31 and, thus, at least for the reasons set forth above with respect 
to claim 31, claims 32-35 should also be allowed. Furthermore, each of the claims 32-35 also 
recites independently patentable features and, thus, should be allowed for that reason. 

Independent claim 36 

Claim 36 recites as follows: 

A computer-readable medium having computer-executable instructions for performing a 
method comprising: 

providing a sequence of two-dimensional frames; 
dividing the sequence into segments; 

calculating a partial model for each segment, wherein the partial model includes 
the same number of frames as the segment said partial model represents and wherein the 
partial model includes three-dimensional coordinates and camera pose, the camera pose 
comprising rotation and translation, for features within the frames; 

extracting virtual key frames from each partial model, the virtual key frames 
having three-dimensional coordinates for the frames and an uncertainty associated with 
the frames; and 

bundle adjusting the virtual key frames to obtain a complete three-dimensional 
reconstruction of the two-dimensional frames. 

The reference to Jain fails to teach or suggest many aspects of Applicants' claim 36. For 
instance, Jain fails to teach or suggest "dividing the sequence into segments" as discussed with 
reference to claims 37 and 1. Jain also fails to teach or suggest calculating a partial model for 
each segment, as discussed with reference to claim 37. Jain also fails to teach or suggest extracting 
virtual key frames from each partial model, as discussed with reference to claim 37. 

Further, Jain fails to teach or suggest "bundle adjusting the virtual key frames to obtain a 
complete three-dimensional reconstruction of the two-dimensional frames." Bundle adjusting is 
described in the specification as follows: 

Bundle adjustment is a non-linear minimization process that is typically applied to 
all of the input frames and features of the input image stream. Essentially, bundle 
adjustment is a non-linear averaging of the features over the input frames to obtain the 
most accurate 3D structure and camera motion. [Specification, page 2, lines 19-22.] 

In box 160, bundle adjustment is performed on the SFMs. Bundle adjustment 
performs simultaneous optimization of 3D point and camera placements by minimizing 
the squared error between estimated and measured image feature locations. There are 
two general approaches to bundle adjustment. The first interleaves structure and motion 
estimation stages as described in "Optimal motion and structure estimation" by J. Weng 
et al. (IEEE Transactions on Pattern Analysis and Machine Intelligence, Sept. 93) and 
"Geometrically constrained structure from motion: Points on planes", by R. Szeliski et 
al. (European Workshop on 3D structure from multiple images of Large-scale 
Environments, June 1998). The second simultaneously optimizes for structure and 
motion "Euclidean reconstruction from uncalibrated views" by R.I. Hartley (Second 
European Workshop on Invariants, Oct. 1993). [Specification, page 15, lines 5-15.] 
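
For illustration only, the "squared error between estimated and measured image feature locations" described in the quoted passage is commonly written as a sum of squared reprojection errors. The notation below is an assumption introduced for this sketch and does not appear in the specification: X_i is the three-dimensional position of feature i, (R_j, t_j) is the rotation and translation (camera pose) for frame j, x_ij is the measured image location of feature i in frame j, V_j is the set of features observed in frame j, and \pi is the camera projection function.

```latex
\min_{\{R_j,\, t_j\},\ \{X_i\}} \;
\sum_{j=1}^{M} \sum_{i \in V_j}
\left\| \, x_{ij} - \pi\!\left( R_j X_i + t_j \right) \right\|^{2}
```

Because \pi is non-linear in the unknowns, minimizing this sum is a non-linear minimization over all 3D points and camera poses at once, consistent with the specification's description of bundle adjustment as "a non-linear minimization process."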

The Examiner states "bundle adjusting the virtual key frames to obtain a complete three- 
dimensional reconstruction of the two-dimensional frames (fig. 12, note the "3D visualization" section 
is the product of the adjusting of the virtual key frames to produce a complete three-dimensional 
reconstruction of the two dimensional frames obtained by video camera 1 to video camera N; also, col. 
24, ln. 38-67, Jain discloses the key frames are used to obtain the best possible three dimensional 
reconstruction of the two-dimensional frame data in that if there is not enough known points from key 
frames, estimates or bundle adjustments were made to ascertain the best, possible three-dimensional 
reconstruction of the two-dimensional frame data to yield the 3D visualization)." [Action, page 33, 
lines 7-15.] 

Applicants respectfully disagree. The Examiner uses the same reference to teach both 
"dividing the sequence into segments" and "extracting virtual key frames from each partial model." 
[Jain, 23:58-24:3.] However, virtual key frames are not synonymous with segments, and at a 
minimum cannot be taught by the same reference. Further, the Examiner indicates that "virtual key 
frames" are taught by Fig. 12. The Examiner states that the "'3D visualization' section is the product 
of the adjusting of the virtual key frames to produce a complete three-dimensional reconstruction of 
the two dimensional frames obtained by video camera 1 to video camera N." [Action, page 33, lines 
1-15.] However, virtual key frames are not disclosed, either in Figure 12 or in col. 24, ln. 38-67. 
Moreover, the Examiner has not shown where in Fig. 12 the "virtual key frames" can be found, or how 
"adjusting" the virtual key frames produces the "3D visualization." Applicants respectfully request 
clarification. 

Jain also does not teach or suggest "bundle adjustment." The Examiner states "Jain discloses 
the key frames are used to obtain the best possible three dimensional reconstruction of the two- 
dimensional frame data in that if there is not enough known points from key frames, estimates or 
bundle adjustments were made to ascertain the best, possible three-dimensional reconstruction of the 
two-dimensional frame data to yield the 3D visualization." Thus, the Examiner is stating, we believe, 
that the estimates disclosed in Jain at 24:51-67 teach or suggest "bundle adjusting." Applicants 
respectfully disagree. 

In Jain, scene analysis is applied to key frames. The scene in question is a football game being 
played on a field with field markings. Applicants believe that the feature points in Jain placed on the 
key frames are the known locations of the field markings. In the scene analysis, references to at least 
three field marks on the frames were used as "known points" to determine camera status. [Jain, 
24:38-43, Fig. 9a.] Some frames, however, did not show the football field itself, and so the field mark 
locations on the field for those frames had to be "estimated," i.e., guessed. [See Fig. 9b for a slide 
which does not show the field, and thus cannot provide accurate field markings.] It appears that feature 
points were added at the presumed location of the field markings. The estimated location (i.e., a 
feature point) was then used to determine camera status. [Jain, 24:51-67, Fig. 9b.] Estimating a 
location of a field mark on a video slide of a football game does not teach or suggest "bundle 
adjusting." Even if we assume that the "known points" of Jain are feature points, Jain teaches only 
adding feature points to a key frame at estimated locations. Bundle adjustment is, at a minimum, "a 
non-linear minimization process" that has nothing to do with adding feature points to frames. 

Further, the estimation in Jain is performed on the original video key frame. [Jain, 24:51-67.] 
The Examiner correlates this video frame with the "segments" of "means for dividing the sequences 
into segments" as previously discussed. Even if, for the sake of argument, we associate Jain's key 
frames with segments consisting of a single frame, the segments are not virtual key frames, as 
previously discussed. Thus, Jain also does not teach or suggest "bundle adjusting the virtual key 
frames." 

Thus, Jain does not teach or suggest many features of claim 36. The reference to Lee, either 
separately or in combination with Jain, also fails to teach or suggest the language of claim 36. Thus, as 
Jain and Lee fail to teach or suggest at least one element of claim 36, claim 36 in its present form 
should be allowed. 



Request for Interview 

If any issues remain, the Examiner is formally requested to contact the undersigned attorney 
prior to issuance of the next Office Action in order to arrange a telephonic interview. It is believed that 
a brief discussion of the merits of the present application may expedite prosecution. Applicants submit 
the foregoing formal Amendment so that the Examiner may fully evaluate Applicants' position, 
thereby enabling the interview to be more focused. 

This request is being submitted under MPEP § 713.01, which indicates that an interview may 
be arranged in advance by a written request. 

Conclusion 

The claims in their present form should now be allowable. Such action is respectfully 
requested. 



One World Trade Center, Suite 1600 
121 S.W. Salmon Street 
Portland, Oregon 97204 
Telephone: (503) 595-5300 
Facsimile: (503) 595-5301 



Respectfully submitted, 
KLARQUIST SPARKMAN, LLP 



By /Genie Lyons/ 

Genie Lyons 
Registration No. 43,841 